43 research outputs found

    DiscoVars: A New Data Analysis Perspective -- Application in Variable Selection for Clustering

    We present a new data analysis perspective for determining variable importance regardless of the underlying learning task. Traditionally, variable selection is considered an important step in supervised learning for both classification and regression problems. Variable selection also becomes critical when the costs of data collection and storage are considerably high, as in remote sensing. We therefore propose a new methodology that selects important variables by first creating dependency networks among all variables and then ranking the variables (i.e., the nodes) by graph centrality measures. Selecting the top-n variables according to a preferred centrality measure yields a strong candidate subset of variables for further learning tasks, e.g., clustering. We present our tool as a Shiny app; Shiny is a development environment for user-friendly interfaces. For comparison, we also extend the user interface to two well-known unsupervised variable selection methods from the literature. Comment: 13 Pages, Technical Report, Verikar Softwar

    Re-mining item associations: methodology and a case study in apparel retailing

    Association mining is the conventional data mining technique for analyzing market basket data; it reveals the positive and negative associations between items. Although pricing and time information are an integral part of transaction data, they have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach that mines price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques
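As a rough illustration of what positive and negative associations between items mean, the sketch below computes the lift of every item pair in a small set of baskets. Using lift as the measure is an assumption here (the abstract does not name one), and `pair_lift` is a hypothetical helper: a lift above 1 suggests a positive association, a lift below 1 a negative one.

```python
from itertools import combinations


def pair_lift(transactions):
    """Return {(item_a, item_b): lift} for every pair of items in the data.

    lift = P(a and b) / (P(a) * P(b)), estimated from basket frequencies.
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    # Support of a single item: fraction of baskets that contain it.
    support = {i: sum(i in b for b in baskets) / n for i in items}
    lifts = {}
    for a, b in combinations(items, 2):
        joint = sum(a in bk and b in bk for bk in baskets) / n
        lifts[(a, b)] = joint / (support[a] * support[b])
    return lifts
```

In a re-mining setting, a table of such pair-level measures could then be joined with price and time attributes and analyzed in a second stage; that second stage is not shown here.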

    A framework for visualizing association mining results

    Association mining is one of the most used data mining techniques due to interpretable and actionable results. In this study we propose a framework to visualize the association mining results, specifically frequent itemsets and association rules, as graphs. We demonstrate the applicability and usefulness of our approach through a Market Basket Analysis (MBA) case study where we visually explore the data mining results for a supermarket data set. In this case study we derive several interesting insights regarding the relationships among the items and suggest how they can be used as basis for decision making in retailing

    An Integrated Approach for Shift Scheduling and Rostering Problems with Break Times for Inbound Call Centers

    It may be very difficult to achieve the optimal shift schedule in call centers whose demand is highly uncertain and peaks during short time periods. Overlapping shift systems are usually designed for such cases. This paper studies shift scheduling and rostering problems for inbound call centers where overlapping shift systems are used. For the shift scheduling problem, an integer programming model is proposed that determines which shifts to open and how many operators to assign to each of them. For the rostering problem, both integer programming and constraint programming models are developed to determine the assignment of operators to shifts, their weekly days off, and their meal and relief break times. The proposed models are tested on real data supplied by an outsourced call center, and optimal results are found in acceptable computation time. The proposed shift scheduling model improves the objective function by 15% compared to the current situation. The computational performance of the proposed integer and constraint programming models for the rostering problem is compared using real data observed at a call center and simulated test instances. In addition, benchmark instances are used to compare our Constraint Programming (CP) approach with the existing models. The results of the comprehensive computational study indicate that the constraint programming model runs more efficiently than the integer programming model for the rostering problem. The originality of this research lies in two contributions: (a) a model for the shift scheduling problem and two models for the rostering problem are presented in detail and compared using real data, and (b) the rostering problem is treated as a task-resource allocation problem, and considerably shorter computation times are obtained by modeling this new problem via CP
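For orientation, the shift scheduling decision described above (which shifts to open and how many operators to assign) is often written as a set-covering integer program. The formulation below is the textbook version, not the model from the paper, and all symbols are assumptions: J is the set of candidate shifts, T the set of time periods, c_j the cost of one operator on shift j, a_{tj} equals 1 if shift j covers period t and 0 otherwise, d_t is the number of operators required in period t, and x_j is the number of operators assigned to shift j (shift j is opened when x_j > 0).

```latex
\begin{align*}
\min_{x} \quad & \sum_{j \in J} c_j x_j \\
\text{s.t.} \quad & \sum_{j \in J} a_{tj} x_j \ge d_t, && \forall t \in T, \\
& x_j \in \mathbb{Z}_{\ge 0}, && \forall j \in J.
\end{align*}
```

Overlapping shifts appear in this sketch simply as columns of a_{tj} that share some periods; break times and days off belong to the rostering stage and are not modeled here.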

    Analyzing classified listings at an E-commerce site by using survival analysis

    Sahibinden.com is a leading e-commerce site in Turkey where sellers (buyers) may advertise their goods (needs) with or without a fee. Because it generates a large volume of traffic to classified car listings, the site plays an important role in determining the market value of used cars. In this study, we first randomly selected 200 car classifieds from the 950 new classified ads posted on February 22, 2012. We then observed these listings daily for a month to record any updates and deletions of the ads. We assume that an ad being taken down means that the car has been sold. In addition to the cars’ features, we observed the posted price and the number of daily views of each ad throughout the data collection. One can therefore construct survival models to study the effects of a car's features and price on the life of its ad. In other words, it is possible to study which features and price levels expedite the sales of used cars
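A first step in this kind of survival analysis is usually a Kaplan-Meier curve for the life of an ad. The abstract does not state which survival models were fitted, so the sketch below is only an assumed starting point; `kaplan_meier` is a hypothetical helper, and ads still online at the end of the observation month are treated as right-censored.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with an event.

    durations: days each ad was followed.
    observed: True if the ad was removed (treated as a sale),
              False if it was still online when observation ended.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival, curve = 1.0, []
    for t in event_times:
        # Ads still being followed just before day t.
        at_risk = sum(d >= t for d in durations)
        # Ads removed exactly on day t.
        events = sum(d == t and e for d, e in zip(durations, observed))
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve
```

Comparing such curves across price bands or car features gives a quick view of which listings sell faster, before moving to a regression-style survival model.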

    Re-mining positive and negative association mining results

    Positive and negative association mining are well-known and extensively studied data mining techniques for analyzing market basket data. Efficient algorithms exist to find both types of association, separately or simultaneously. Association mining is performed by operating on the transaction data. Despite being an integral part of the transaction data, pricing and time information has not been incorporated into market basket analysis so far, and additional attributes have been handled using quantitative association mining. In this paper, a new approach is proposed to incorporate price, time and domain related attributes into data mining by re-mining the association mining results. The underlying factors behind positive and negative relationships, as indicated by the association rules, are characterized and described through the second data mining stage, re-mining. The applicability of the methodology is demonstrated by analyzing data from the apparel retailing industry, where price markdown is an essential tool for promoting sales and generating increased revenue

    A data mining based markdown management system prototype for the Turkish apparel industry

    A major trend in the apparel industry, an important sub-sector of the textile industry, is that apparel producers also operate as retailers (LC Waikiki, Mavi Jeans, etc.). In recent years, especially in the USA, retail analytics software developed to help decision makers in the retail sector has come into widespread use (Lightship Partners, 2009). Turkish apparel retailers need similar decision support systems to compete with, and get ahead of, international apparel chains. One of the most significant functions of retail analytics software is markdown optimization, which decides on the level of markdown price for items throughout a season. A markdown is a special type of discount, applied to items whose sales are declining or about to decline, in which the price is monotonically non-increasing throughout the season: once reduced, the price cannot rise above the discounted level. Apparel is among the sectors where markdowns are used most often. Existing academic research on markdown optimization and commercial software for retail analytics assume that the demands of items are independent, ignoring the price-dependent complementarity and substitution effects between them (cross-price elasticities). However, such associations and interactions between items are important and should be taken into account during markdown optimization. In this project, the goal is to construct a methodology and a prototype decision support system for markdown optimization. The methodology starts by finding complementary and substitute products through positive and negative association mining, respectively. The demand for each item is then forecast from the prices of the items it is associated with. Finally, approximate dynamic programming is used to compute markdown ratios and their timing. The methodology is tested with real-world sales data from LC Waikiki, the largest apparel retail chain in Turkey. Publisher's Versio
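To make the markdown timing problem concrete, here is a deliberately simplified sketch: an exact dynamic program for a single item with deterministic demand, in which the price may only stay the same or go down from one period to the next. This is not the project's approximate dynamic programming method and it ignores the cross-item effects the abstract emphasizes; the function name `best_markdown_plan` and the demand table are hypothetical.

```python
from functools import lru_cache


def best_markdown_plan(prices, demand, periods, stock):
    """Return (revenue, price_plan) maximizing revenue for one item.

    prices: allowed prices in descending order.
    demand: dict price -> units sold per period (assumed deterministic).
    """
    @lru_cache(maxsize=None)
    def value(t, level, left):
        # No revenue once the season is over or the stock is gone.
        if t == periods or left == 0:
            return 0.0, ()
        best = (-1.0, ())
        # Markdown rule: keep the current price level or move to a lower one.
        for nxt in range(level, len(prices)):
            sold = min(left, demand[prices[nxt]])
            future, plan = value(t + 1, nxt, left - sold)
            revenue = prices[nxt] * sold + future
            if revenue > best[0]:
                best = (revenue, (prices[nxt],) + plan)
        return best

    return value(0, 0, stock)
```

With many items, stochastic demand and cross-price effects the state space grows quickly, which is presumably why an approximate rather than exact dynamic program is used in the project.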